On 23-24 September, we will host a hands-on workshop with Akira Murakami from the University of Birmingham, who will introduce methods for retrieving complex patterns from a corpus using R.
Akira will introduce key functions from the tidyverse, particularly the stringr package. The workshop will cover regular expressions (regex) for identifying text patterns, part-of-speech annotation for extracting linguistic structures, and constituency parsing with Tregex for identifying patterns in parsed texts. Time permitting, Akira will also demonstrate how to use these techniques to calculate syntactic complexity with the L2 Syntactic Complexity Analyzer.
The workshop is on-location only and will take place in room Lossi 3-425 on Monday, 23.09, at 14:15-15:45 and on Tuesday, 24.09, at 12:15-13:45.
More information is available on MEDAL's website.
Events are organised by the Methodological Excellence in Data-Driven Approaches to Linguistics (MEDAL) consortium and are financed by the EU Horizon Europe programme (101079429) and the UK Research and Innovation organisation (101079429).